Dr. Kam Tin Seong, Assoc. Professor of Information Systems (Practice)
School of Computing and Information Systems, Singapore Management University
20 May 2025
What will you learn from this lesson?
Visual Analytics for Knowledge Discovery
Visual Analytics Approach for Statistical Testing
Visual Analytics for Building Better Models
Visualising Uncertainty
Variation and Its Discontents
Visual Analytics for Knowledge Discovery
Motivation: To combine data visualisation and statistical modeling.
Visual Statistical Testing
To provide alternative statistical inference methods by default.
Visual Statistical Testing
To follow best practices for statistical reporting.
For all statistical tests reported in the plots, the default template abides by the APA gold standard for statistical reporting. For example, here are results from a robust t-test:
Two-sample means
Boxplot revealing the mean and distribution of two samples.
Boxplot with two-sample mean test
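The test that accompanies a two-sample boxplot can be sketched numerically. The slides do not say which test the plot reports, so the sketch below assumes Welch's unequal-variance t-test as a stand-in; the function name `welch_t` and the two groups of scores are hypothetical and serve only as illustration.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    # Squared standard error of each mean (sample variance / sample size).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df

# Hypothetical scores for two groups (assumed data, not from the slides).
group_a = [72, 85, 78, 90, 66, 81]
group_b = [64, 70, 75, 68, 59, 73]
t_stat, dof = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

The p-value would then come from a t distribution with `dof` degrees of freedom, which a statistics library can supply.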
Visually-driven Correlation Analysis
Scatter plot showing the relationship between two continuous variables.
Scatter plot with significance test of correlation.
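The significance test shown on such a scatter plot can be sketched as follows, assuming Pearson's correlation coefficient and the usual t statistic for the null hypothesis of zero correlation. The function names and the data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_t(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical paired observations (assumed data, not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)
t_stat = correlation_t(r, len(x))
print(f"r = {r:.2f}, t = {t_stat:.3f}, df = {len(x) - 2}")
```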
Visually-driven Association (Independence) Analysis
Mosaic plot showing the association between two categorical variables.
Stacked bar chart with significance test of association.
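A minimal sketch of the test behind such a chart, assuming Pearson's chi-square test of independence on a contingency table of counts. The function name and the table are hypothetical.

```python
def chi_square_independence(table):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical 2 x 2 table of counts (assumed data, not from the slides).
counts = [[10, 20],
          [20, 10]]
chi2, df = chi_square_independence(counts)
print(f"chi-square = {chi2:.3f}, df = {df}")
```

A table whose rows are proportional, such as `[[10, 20], [20, 40]]`, gives a statistic of zero, which is what independence looks like in the counts.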
Visual Analytics Approach for Building Exploratory Models
Model Diagnostic: Checking for multicollinearity
Conventional statistical report
Visual Analytics approach
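A common numeric summary for multicollinearity is the variance inflation factor (VIF). The slides do not state which measure their report uses, so treat the sketch below as one assumed option. It relies on the fact that the VIFs are the diagonal entries of the inverse of the predictors' correlation matrix; all function names and the data are hypothetical.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length columns."""
    n = len(columns[0])
    scaled = []
    for col in columns:
        mean = sum(col) / n
        dev = [v - mean for v in col]
        norm = math.sqrt(sum(d * d for d in dev))
        scaled.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in scaled]
            for ci in scaled]

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    k = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]  # zero here means perfectly collinear predictors
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def vif(columns):
    """Variance inflation factor of each predictor column."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Hypothetical predictors (assumed data, not from the slides).
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
print([round(v, 3) for v in vif([x1, x2])])
```

A VIF of 1 means a predictor is uncorrelated with the others; larger values indicate that its coefficient estimate is inflated by overlap with other predictors.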
Visual Analytics Approach for Building Exploratory Models
Model Diagnostic: Checking normality assumption
Model Diagnostic: Checking model for homogeneity of variances
Visual Analytics Approach for Building Exploratory Models
Analysing model parameters
Conventional statistical report
Visual Analytics approach
Visualising Uncertainty
Why is it important?
One of the most challenging aspects of data visualization is the visualization of uncertainty.
Source: Chart 61, LABOUR FORCE IN SINGAPORE 2019, pg. 52.
Why shouldn’t one use a bar graph, even if the data are normally distributed?
It is not appropriate to display average values on bars.
Why do error bars fail?
Each error bar is constructed using a 95% confidence interval of the mean.
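The interval behind each error bar can be sketched as the sample mean plus or minus a critical value times the standard error. The sketch below assumes a large-sample normal approximation (critical value of about 1.96); for small samples a t quantile would normally replace it. The function name and the data are hypothetical.

```python
import math
import statistics

def mean_ci(sample, level=0.95):
    """Normal-approximation confidence interval for the mean of a sample."""
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error of the mean, using the sample standard deviation.
    se = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * se, mean + z * se

# Hypothetical measurements (assumed data, not from the slides).
values = [10, 12, 14, 16, 18]
lower, upper = mean_ci(values)
print(f"95% CI of the mean: ({lower:.2f}, {upper:.2f})")
```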
Error bar on a dot plot
Each error bar is constructed using a 95% confidence interval of the percentage.
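For a percentage, one simple construction is the Wald interval for a proportion. The slides do not say which interval the chart uses, so this is an assumed choice, and it is known to behave poorly for small samples or percentages near 0 or 100. The function name and the counts are hypothetical.

```python
import math
import statistics

def percentage_ci(successes, n, level=0.95):
    """Wald confidence interval for a percentage, returned in percent."""
    p = successes / n
    # Standard error of a sample proportion.
    se = math.sqrt(p * (1 - p) / n)
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return 100 * (p - z * se), 100 * (p + z * se)

# Hypothetical survey result: 40 of 100 respondents (assumed data).
low_pct, high_pct = percentage_ci(40, 100)
print(f"95% CI of the percentage: ({low_pct:.1f}%, {high_pct:.1f}%)")
```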